FB2008_06 Release Notes
General FlyBase
FB2008_06
Statistics

| Statistic | Count |
|---|---|
| Number of references in FlyBase | 188933 |
| Number of research papers | 79689 |
| Number of abstracts | 37379 |
| Number of personal communications to FlyBase | 4110 |
| Number of fly stocks | 86625 |
| Number of fly images | 981 |
| Drosophila workers registered with FlyBase | 7441 |

Drosophila melanogaster (R5.9)

| Statistics | Count |
|---|---|
| Gene records | 30747 |
| Genes located to the genome | 15045 |
| Genes not located to the genome | 15702 |
| Alleles | 105377 |
| Alleles of located genes | 86520 |
| Alleles of unlocated genes | 18857 |
| Aberrations | 18109 |
| Deficiencies | 7590 |
| Deficiencies with mapped endpoints | 1491 |
| Transposable element insertions | 90619 |
| Insertions mapped to the sequence | 42791 |

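The gene and allele counts above appear to partition into located and unlocated subsets. The following is a minimal sketch of that arithmetic check, not part of the FlyBase release itself; the dictionary keys and the helper name `is_partition` are illustrative assumptions, and the numbers are copied from the table.

```python
# Counts copied from the D. melanogaster (R5.9) statistics table.
# Key names are hypothetical labels chosen for this sketch.
stats = {
    "gene_records": 30747,
    "genes_located": 15045,
    "genes_not_located": 15702,
    "alleles": 105377,
    "alleles_of_located_genes": 86520,
    "alleles_of_unlocated_genes": 18857,
}


def is_partition(total_key, *part_keys):
    """Return True if the parts sum exactly to the total."""
    return stats[total_key] == sum(stats[key] for key in part_keys)


genes_ok = is_partition("gene_records", "genes_located", "genes_not_located")
alleles_ok = is_partition(
    "alleles", "alleles_of_located_genes", "alleles_of_unlocated_genes"
)
print("gene counts consistent:", genes_ok)
print("allele counts consistent:", alleles_ok)
```

The same helper could be pointed at other rows, but only where the table actually describes a total and its complete set of parts.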
Annotation Release 5.9

Summary of changes from previous release

| Change type | Count |
|---|---|
| New Gene Models | 3 |
| Restored Gene Models | 0 |
| Deleted Gene Models | 65 |
| Merged Gene Models | 71 -> 31 |
| Split Gene Models | 8 -> 18 |
| Complex combinations | 0 |

Annotated Gene Models

| Annotated Gene Models | Count | Avg. size | Longest | Shortest | Change |
|---|---|---|---|---|---|
| Genes | 15045 | 5484 | 258567 | 19 | -100 |
| Protein coding genes | 14008 | 5840 | 258567 | 132 | -100 |
| Protein coding transcripts | 21064 | 2368 | 69439 | 132 | -6 |
| Exons | 68370 | 479 | 27725 | 1 | -139 |
| Introns | 50836 | 1385 | 166135 | 17 | -60 |
| 5' untranslated regions | 19086 | 184 | 3391 | 1 | 124 |
| 3' untranslated regions | 13396 | 374 | 5684 | 1 | 125 |
| Unique polypeptides | 18104 | 583 | 22971 | 25 | -53 |
| rRNA genes | 161 | 504 | 6026 | 123 | 0 |
| rRNA | 161 | 504 | 6026 | 123 | 0 |
| tRNA genes | 314 | 75 | 186 | 61 | 0 |
| tRNA | 314 | 73 | 87 | 61 | 0 |
| snRNA genes | 47 | 115 | 275 | 36 | 0 |
| snRNA | 47 | 115 | 275 | 36 | 0 |
| snoRNA genes | 249 | 113 | 316 | 46 | 0 |
| snoRNA | 249 | 113 | 316 | 46 | 0 |
| miRNA genes | 90 | 24 | 100 | 19 | 0 |
| miRNA | 90 | 24 | 100 | 19 | 0 |
| Miscellaneous non-coding RNA genes | 87 | 3042 | 31065 | 31 | 0 |
| Miscellaneous non-coding RNA | 104 | 1185 | 14084 | 31 | 0 |
| Pseudogenes | 88 | 3217 | 179585 | 53 | 0 |
| Transposable elements present in the sequenced strain | 5552 | 1507 | 66001 | 23 | 0 |
| Annotated repeat regions | 10159 | | | | |

Other Annotated Gene Features

Mapped Nucleotide Changes

| Annotated Gene Features | Count | Change |
|---|---|---|
| total mapped nucleotide changes | 3583 | 0 |
| aberration junction | 193 | 0 |
| complex substitution | 52 | 0 |
| deletion | 225 | 0 |
| insertion site | 48 | 0 |
| point mutation | 2804 | 0 |
| sequence variant | 204 | 0 |
| TE target site duplication | 40 | 0 |
| uncharacterized change in nucleotide sequence | 17 | 0 |

Mapped Regulatory Elements

| Annotated Gene Features | Count | Change |
|---|---|---|
| total mapped regulatory elements | 2319 | 0 |
| enhancer | 22 | 0 |
| poly A site | 98 | 0 |
| protein binding site | 1396 | 0 |
| regulatory region | 240 | 0 |
| rescue fragment | 563 | 0 |

Mapped Reagent Features

| Annotated Gene Features | Count | Change |
|---|---|---|
| transposable element insertion site | 42791 | 50 |
| microarray amplicons | 14095 | 0 |
| dsRNA amplicons | 67381 | 0 |
| BAC | 973 | 0 |
| oligonucleotide | 583294 | 0 |

Aligned Evidence Features

Nucleotide Alignments

| Annotated Gene Features | Algorithm | Count | Change |
|---|---|---|---|
| D. melanogaster cDNA inserts | sim4tandem,splign | 16159 | 0 |
| D. melanogaster EST | sim4,splign | 483268 | 0 |
| Other melanogaster DNA sequences | sim4tandem,splign | 12707 | 13 |

Gene Predictions

| Annotated Gene Features | Algorithm | Count | Change |
|---|---|---|---|
| Augustus prediction | Augustus 1.0 | 12292 | 0 |
| BATZ Contrast | CONTRAST | 14219 | 0 |
| BATZ Contrast NA | CONTRAST | 13589 | 0 |
| CONGO exons | CONGO | 40544 | 0 |
| DGIL snap | SNAP | 19640 | 0 |
| DGIL snap homology | SNAP | 22949 | 0 |
| Genie prediction | Genie v2.2/flyGenie | 11248 | 0 |
| Genscan prediction | Genscan 1.0 | 18909 | 0 |
| NCBI gnomon | GNOMON | 19729 | 0 |
| RGUI geneid | GENEID 1.2 | 12389 | 0 |
| RGUI geneid u12 | GENEID 1.2 | 12717 | 0 |

Proteins Aligned

| Annotated Gene Features | Algorithm | Count | Change |
|---|---|---|---|
| D. melanogaster proteins | WU-blastx 2.0, Prosplign | 30522 | 22312 |
| Other insect proteins | WU-blastx 2.0 | 7076 | 1881 |
| Nematode proteins | WU-blastx 2.0 | 6361 | 0 |
| Yeast proteins | WU-blastx 2.0 | 2170 | 0 |
| Plant proteins | WU-blastx 2.0 | 8396 | 0 |
| Rodent proteins | WU-blastx 2.0 | 14824 | 0 |
| Primate proteins | WU-blastx 2.0 | 13691 | 0 |
| Other invertebrate proteins | WU-blastx 2.0 | 13070 | 24 |
| Other vertebrate proteins | WU-blastx 2.0 | 10443 | 0 |
| Other proteins | Prosplign | 27576 | 837 |

Translated Nucleotide Alignments

| Annotated Gene Features | Algorithm |
|---|---|
| Insect ESTs | WU-tblastx 2.0 |
| A. gambiae genomic | WU-tblastx 2.0 |
| D. pseudoobscura genomic | WU-tblastx 2.0 |

Release 5 Sequence Assembly

Date updated: 12-FEB-2008

TABLE 1: Overview of the Release_5 Assembly (from BDGP)

| Item | Details |
|---|---|
| Current Release | Release 5 |
| Sequencing Center | Primary Center: BDGP; Collaborating Centers: DHGP, BCM-HGSC, Celera |
| Sequenced Strain | y1; cn1 bw1 sp1 |
| Flies available | Bloomington: 2057; DGRC-Kyoto: 106641; Tucson: 14021-0231.36 |
| BAC clones available | CHORI BPRC |
| Date Released: Finished Arms (see Table 2 for arm accessions) | 18-APR-2006 |
| Date Released: WGS Heterochromatin (AABU01000000) | 20-NOV-2007 |

TABLE 2: Detailed Information on the Release_5 Assembly

| Scaffold | Length (bp) | Internal Gaps (see Table 3) | GenBank Accession | Major Difference Compared to Release 4 (Arm Scaffolds Only) |
|---|---|---|---|---|
| ArmX | 22,422,827 | 3 | AE014298 | 8kb added to the distal end. Gaps filled in regions 1-11. |
| Arm2L | 23,011,544 | 2 | AE014134 | 591kb added to the proximal end of the arm. |
| Arm2R | 21,146,708 | 1 | AE013599 | 380kb added to the proximal end. |
| Arm3L | 24,543,557 | 1 | AE014296 | 16kb added on distal end. 718kb added to proximal end. Other gaps filled. |
| Arm3R | 27,905,053 | 0 | AE014297 | None. |
| Arm4 | 1,351,857 | 1 | AE014135 | 70kbp added to the distal end. |
| XHet | 204,112 | 0 | CM000460 | WGS |
| YHet | 347,038 | Draft | CM000461 | WGS - Unordered |
| 2LHet | 368,872 | 2 | CM000456 | WGS |
| 2RHet | 3,288,761 | Draft | CM000457 | WGS - Unordered |
| 3LHet | 2,555,491 | Draft | CM000458 | WGS - Unordered |
| 3RHet | 2,517,507 | Draft | CM000459 | WGS - Unordered |
| ArmUn | 10,049,037 | Draft | FA000001 | WGS - Unordered |

TABLE 3: Known Gaps in the Release 5 Major Arms Assembly

| Scaffold | GenBank Accession | Gaps | Notes | Gene Association |
|---|---|---|---|---|
| ArmX | AE014298 | 111523..129522 | sized | |
| ArmX | AE014298 | 21684450..21684549 | unsized | |
| ArmX | AE014298 | 21687344..21759343 | sized | |
| Arm2L | AE014134 | 21485539..21485638 | unsized | Histone Gene Cluster * |
| Arm2L | AE014134 | 22420242..22420341 | unsized | |
| Arm2R | AE013599 | 16668213..16668312 | unsized | |
| Arm3L | AE014296 | 5107767..5107866 | unsized | |
| Arm3R | AE014297 | None | -- | |
| Arm4 | AE014135 | 1221289..1221388 | unsized | |

* Of the ca. 100 copies of the 5 kb histone repeat unit, 20 are present in the Release 5 AE014134 scaffold.
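
The gap coordinates in Table 3 can be turned into lengths with a little arithmetic. The sketch below assumes the `start..end` ranges are 1-based and inclusive, as in GenBank location notation; that convention is an assumption here and is not stated in the table. The function name `gap_length` and the `gaps` list are illustrative.

```python
def gap_length(span: str) -> int:
    """Length in bp of a 'start..end' range, assuming inclusive coordinates."""
    start, end = (int(part) for part in span.split(".."))
    return end - start + 1


# (scaffold, coordinates, note) copied from Table 3; Arm3R has no gaps.
gaps = [
    ("ArmX", "111523..129522", "sized"),
    ("ArmX", "21684450..21684549", "unsized"),
    ("ArmX", "21687344..21759343", "sized"),
    ("Arm2L", "21485539..21485638", "unsized"),
    ("Arm2L", "22420242..22420341", "unsized"),
    ("Arm2R", "16668213..16668312", "unsized"),
    ("Arm3L", "5107767..5107866", "unsized"),
    ("Arm4", "1221289..1221388", "unsized"),
]

for scaffold, span, note in gaps:
    print(f"{scaffold}\t{span}\t{note}\t{gap_length(span)} bp")
```

Under that assumption, every gap marked unsized spans 100 bp, and the two sized ArmX gaps span 18,000 bp and 72,000 bp.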
TABLE 4: Distribution of Centric Heterochromatin in Release 5 Scaffolds

The locations of the euchromatin/heterochromatin boundaries on the five major chromosome arm scaffolds have been determined by the DHGP.

| Chr. Arm | Major Arm Scaffold GenBank Accessions & Seq. Coordinates | Heterochromatin Scaffold GenBank Accessions & Seq. Coordinates |
|---|---|---|
| X | | |